-
-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
caddyhttp: Use sendfile(2) by implementing ReadFrom on wrappers #5022
Conversation
Awesome! I'm curious how much of a throughput benefit this will have for Caddy with this change! |
Holy shnikies -- I think it works:
Where index.html is just the Caddy homepage, and my Caddyfile is:
Bravo, @flga -- you've nearly doubled the performance of the file server over HTTP. Interestingly, I also see performance improvements with TLS:
which I would not expect to happen. But I was able to repeat these results. This might be good news for production sites. I still want to collect profiles during these tests to see where the differences are when sendfile() is used versus when it's not. I might do that later tonight or tomorrow. @flga Can I invite you to our developers Slack? It's for Caddy maintainers, sponsors, developers, etc. Would love to have you hang out with us there. Just let me know which email to send the invite to. |
Tried it out for myself as well! I can confirm essentially the same thing: HTTP Without patch:
HTTP With patch:
HTTPS Without patch:
HTTPS With patch:
|
Thanks for verifying, Francis! As Francis suggested in Slack, the improvements with TLS are probably due to ReadFrom() being an optimization in and of itself, as fewer buffers are allocated for the copying. It probably doesn't use sendfile but you still get some optimization with ReadFrom()! |
CPU profiles were almost identical. Heap profiles were also pretty similar, but with strikingly fewer allocations and copying across buffers in the patched version (as expected):
I agree, I just don't know how to do that... still, I feel like by now we've implemented the most crucial interfaces... I'm open for ideas on better ways to do this though. Thanks for the test case by the way! The code change looks good and is just about exactly how I would have done it. Better, even. I'll do a final review soon and maybe tweak a thing or two, but either way we'll get this merged very shortly. Thank you for this great patch! |
This has the green light to be merged any time -- so feel free. I am curious how this might affect reverse proxy (both the HTTP transport and the FastCGI transport), as I haven't had a chance to test those yet. If we need to add a ReadFrom or WriteTo somewhere, we can do that. Doesn't have to be in this PR; could come later. |
I tried with 2 Caddy instances, a |
Verified that this does not affect reverse proxy. (Which is fine, maybe a later patch can address that.) Our reverse proxy is derived from the std lib's Other differences between the two are primarily about error handling. The omission of this optimization in httputil appears to be intentional: golang/go#21814 However, just for kicks and giggles I added it back into ours to see what the difference is, even though error handling is "incorrect" when the optimization is used: func (h Handler) copyBuffer(dst io.Writer, src io.Reader, buf []byte) (int64, error) {
if wt, ok := src.(io.WriterTo); ok {
return wt.WriteTo(dst)
}
if rt, ok := dst.(io.ReaderFrom); ok {
return rt.ReadFrom(src)
}
// ... Proxying the same payload as my test above, I got about 35k-36k req/sec. With this patch and re-adding the ReadFrom call, I observed about 40k req/sec. Not a huge boost, but a repeatable result, and still an improvement. Go's sendfile support only works for TCP connections with an *os.File on one side, even though sendfile(2) supports any file descriptors. That seems unfortunate that we can't copy from one TCP socket to another (unless TCPConns do assert as *os.File? I doubt it?) -- maybe I'm wrong, but that's what it seems like. So anyway, if this patch were to affect reverse proxy, we'd just need to call ReadFrom in our copyBuffer. It doesn't use sendfile (unless we implement that manually -- which I'd be open to reviewing), but it does reduce allocations/copies. But then, we have the problem of less control over error handling, which is why httputil doesn't just use io.copyBuffer. Hmm. So I just thought I'd mention this here. This patch still LGTM! |
…From if the underlying response writer implements it. Doing so allows for splice/sendfile optimizations when available.
Hey those are some nice gains, I was expecting something more modest around the 10% mark but I won't complain 😄
So it kinda ends up in @mholt re: slack, I can dm you my email on twitter @flga_ |
@flga Sure, DM sent. Ah right, I forgot about the differences between sendfile and splice. Once we've written the headers from the backend, it might very well be possible to use splice to finish copying the response body. I'd love to review and test that! (I know it's HTTP-only but I bet we have more HTTP reverse proxies than HTTP file servers out there, since reverse proxying is very common on internal environments where TLS isn't needed.) |
Doing so allows for splice/sendfile optimizations when available. Since splice can now be leveraged as well, reverse proxy might also benefit tho I haven't confirmed it.
Fixes #4731
To check whether sendfile is really being called or not we can construct two files, any file over 512 bytes should use sendfile in http 1.
logs trimmed for brevity
Given this change works under the covers I've added some tests to ensure it won't break.
It would be nice if we could fix
ResponseWriterWrapper
in a less blunt way. Currently, if the underlying writer does not supportReadFrom
we callio.Copy
(again). Unfortunately we're fairly constrained on what we can do without breaking everything everywhere so this seemed the most reasonable, backwards compatible, option. The ideal solution would be to not rely on embedding aResponseWriterWrapper
but creating one at runtime that implements specifically and exclusively what the underlying writer exposes.ResponseRecorder
is now aware ofReadFrom
too, since it needs to spy on it to gather response size.